Managing Change in Large-Scale Data Sharing Systems
Authors
Abstract
The problem of sharing data across multiple sources has received considerable attention in recent years because of its relevance to enterprise data management, scientific data management, and information integration on the WWW. However, the management of updates in such systems has received very little attention. In a data sharing system, the set of sources and clients is not fixed, and therefore the sources publishing the updates do not necessarily know exactly who will consume them. Consequently, the system needs to support a variety of update propagation strategies. In this paper, our approach is based on identifying two kinds of objects of interest, which are treated as first-class citizens in the system: updategrams, which are descriptions of updates over base relations, and boosters, which complement updategrams to speed up the processing of join views. We derive a complete set of rules governing the production, combination, and reconciliation of updategrams and boosters over differing time intervals. Our rules cover both GAV and LAV style mediation, as well as views involving certain forms of aggregation. We show how to use our rules to produce efficient query execution plans by extending the System-R style query optimizer, and present experiments that evaluate several heuristics for pruning the search space of plans that use updategrams and views for query evaluation.
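As a rough illustration of the updategram idea described in the abstract, the sketch below models an updategram as a pair of inserted and deleted tuple sets over one base relation, tagged with a time interval, and shows how two updategrams over adjacent intervals might be combined into one. The class name `Updategram`, its fields, and the `compose` rule are hypothetical and assume simple set semantics; the abstract does not state the paper's actual rules, and boosters are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Updategram:
    """Hypothetical model: net changes to one base relation over [start, end)."""
    start: int
    end: int
    inserted: frozenset = frozenset()
    deleted: frozenset = frozenset()

    def apply(self, relation: frozenset) -> frozenset:
        # Set semantics (an assumption): remove deleted tuples, then add inserted ones.
        return (relation - self.deleted) | self.inserted

    def compose(self, later: "Updategram") -> "Updategram":
        # Combine two updategrams over adjacent intervals into one updategram
        # covering the union of both intervals.
        if self.end != later.start:
            raise ValueError("intervals must be adjacent")
        return Updategram(
            start=self.start,
            end=later.end,
            # An earlier insert survives only if it is not deleted later.
            inserted=(self.inserted - later.deleted) | later.inserted,
            # An earlier delete survives only if the tuple is not re-inserted later.
            deleted=(self.deleted - later.inserted) | later.deleted,
        )


base = frozenset({(1, "a"), (2, "b")})
u1 = Updategram(0, 1, inserted=frozenset({(3, "c")}), deleted=frozenset({(1, "a")}))
u2 = Updategram(1, 2, inserted=frozenset({(1, "a")}), deleted=frozenset({(3, "c")}))

combined = u1.compose(u2)
# Applying the combined updategram matches applying u1 and then u2.
assert combined.apply(base) == u2.apply(u1.apply(base))
```

Under these assumptions, a client that missed both intervals could consume `combined` once instead of replaying `u1` and `u2` in order, which is the kind of flexibility in update propagation that the abstract motivates.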
Similar resources
Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontology is the main infrastructure of the Semantic Web, providing facilities for the integration, searching, and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the need for ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...
Access control in ultra-large-scale systems using a data-centric middleware
The primary characteristic of an Ultra-Large-Scale (ULS) system is ultra-large size on any related dimension. A ULS system is generally considered as a system-of-systems with heterogeneous nodes and autonomous domains. As the size of a system-of-systems grows, and interoperability demand between sub-systems is increased, achieving more scalable and dynamic access control system becomes an im...
Isolation Levels for Data Sharing in Large-Scale Scientific Workflows
Scientists can benefit from Grid and Cloud infrastructures to face the increasing need to share scientific data and execute data-intensive workflows at a large scale. However, these workflows are creating more and more challenging problems in the automation of data management during execution. Existing workflow management systems focus on how data is stored, transferred and on data provenance. H...
Coordinated Learning to Support Resource Management in Computational Grids
Managing resources in large scale distributed systems is an important concern for both Peer-2-Peer and Computational Grid systems, and is a complex and time sensitive process. Although existing Peer-2-Peer systems are divided into those that support computation (CPU) sharing or data sharing, users in a Computational Grid generally need to share both. Identifying which resources to select is impor...
Electric-vehicle car-sharing in one-way car-sharing systems considering depreciation costs of vehicles and chargers
In recent years, car-sharing systems have been announced as a way to increase mobility and to decrease the number of single-occupant vehicles, congestion, and air pollution in many parts of the world. This study presents a linear programming model to optimize one-way car-sharing systems for electric cars considering the depreciation costs of chargers and vehicles as well as relocation cost of v...